Disassembling Programs

This chapter describes the purpose and usage of the disassembler, which is enabled with the `-d' command-line option. The disassembler has two main uses:

Provides insight into the internal operation of the virtual machine.
Provides an excellent tool for debugging new additions or changes to the operation of the virtual machine.

However, the disassembler will not normally help users debug their programs. It is a tool for working on the executable, rather than a tool for working with the language.

The disassembler produces a readable representation of the virtual-machine codes. It is recommended that the `-q', `-l', and `-n' options be used in conjunction with the `-d' option, since this combination produces more readable output. Unless you are extremely patient, the `-l' option is almost essential: it prevents loading of all the R-files in the rlib directory. Without `-l', you will wait an extremely long time while the op-codes for each library function are printed to the screen. The `-n' option merely suppresses the generation of file and line number information, which makes the output more readable.

For example, let's invoke RLaB with the correct options and look at the output from a simple expression:

augerinn:~$ rlab -qrldn
Welcome to RLaB. New users type `help INTRO'
RLaB version 0.97d beta Copyright (C) 1992, 93, 94 Ian Searle
RLaB comes with ABSOLUTELY NO WARRANTY; for details type `help WARRANTY'
This is free software, and you are welcome to redistribute it under
certain conditions; type `help CONDITIONS' for details
> 1 + 3 - 2  
   1: push constant
   2: 1
   3: push constant
   4: 3
   5: add
   6: push constant
   7: 2
   8: sub
   9: print
  10: stop
        2

What you see is the internal representation of the expression `1 + 3 - 2'. The parser generates the op-codes and stores them internally for execution by the virtual machine. The virtual machine in RLaB is an internal stack machine which sequentially executes integer instructions. The instructions, or op-codes, are created as each statement is read and parsed. The op-codes are then executed as soon as a valid end-of-statement is reached. For user-functions, the op-codes are stored as variables, then executed each time the variable is used in a function evaluation. Therefore, a function need only be parsed the first time it is used.

Now let us look at the use of a variable assignment. The only real difference between this and the previous example is the use of the assign operator.

> a = 3
   1: push var
   2: a
   3: push constant
   4: 3
   5: assign
   6: print
   7: stop
 a =
        3

The use of a trailing semicolon makes a slight change: the print is changed to a pop clean.

> a = 3;
   1: push var
   2: a
   3: push constant
   4: 3
   5: assign
   6: pop clean
   7: stop
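
The listings so far can be traced by hand with a very small interpreter. The following Python sketch is not RLaB's implementation (the real machine executes integer instructions, and its source should be consulted for the actual behaviour). It assumes, for illustration only, that each operand is stored in the entry right after its op-code, as the numbered listings suggest, and that assign leaves the variable on the stack so that print or pop clean can consume it. The names Var and run are hypothetical.

```python
# Illustrative sketch only: op-code names are copied from the listings above,
# but the data layout and semantics are assumptions, not RLaB's source.


class Var:
    """A reference to a named variable, as pushed by `push var`."""

    def __init__(self, name):
        self.name = name


def run(code, variables, output):
    """Execute a flat list in which operands follow their op-codes."""
    stack = []

    def value_of(item):
        # A variable reference is resolved only when its value is needed.
        return variables[item.name] if isinstance(item, Var) else item

    pc = 0
    while True:
        op = code[pc]
        pc += 1
        if op == "push constant":
            stack.append(code[pc])       # the operand is the next entry
            pc += 1
        elif op == "push var":
            stack.append(Var(code[pc]))
            pc += 1
        elif op in ("add", "sub"):
            right = value_of(stack.pop())
            left = value_of(stack.pop())
            stack.append(left + right if op == "add" else left - right)
        elif op == "assign":
            value = value_of(stack.pop())
            target = stack.pop()
            variables[target.name] = value
            stack.append(target)         # left for print or pop clean
        elif op == "print":
            item = stack.pop()
            label = item.name + " = " if isinstance(item, Var) else ""
            output.append(label + str(value_of(item)))
        elif op == "pop clean":
            stack.pop()                  # discard the result silently
        elif op == "stop":
            return
        else:
            raise ValueError("unknown op-code: %r" % (op,))


variables, output = {}, []
run(["push constant", 1, "push constant", 3, "add",
     "push constant", 2, "sub", "print", "stop"], variables, output)
run(["push var", "a", "push constant", 3, "assign", "print", "stop"],
    variables, output)
run(["push var", "a", "push constant", 3, "assign", "pop clean", "stop"],
    variables, output)
print(output)  # ['2', 'a = 3']
```

The third call stores the value but adds nothing to the output, which mirrors the effect of the trailing semicolon.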

Combining the operation and assign operators works just as you would expect: the result is calculated, then assigned to the left-hand-side variable.

> b = a - 2
   1: push var
   2: b
   3: push var
   4: a
   5: push constant
   6: 2
   7: sub
   8: assign
   9: print
  10: stop
 b =
        1
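
The order of the op-codes in these listings is what a parser produces when it emits code for the operands before the operator. As a rough sketch, assuming Python's ast module as a stand-in for the RLaB grammar (the helper names emit and compile_statement are hypothetical, and only the few constructs shown so far are handled), the listing for `b = a - 2' can be regenerated like this:

```python
import ast

# Illustrative sketch only: Python's own parser stands in for RLaB's
# parser, and the op-code names are copied from the listings above.

BINARY_OPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "multiply"}


def emit(node, code):
    """Append op-codes for an expression in operands-first order."""
    if isinstance(node, ast.Constant):
        code.extend(["push constant", node.value])
    elif isinstance(node, ast.Name):
        code.extend(["push var", node.id])
    elif isinstance(node, ast.BinOp):
        emit(node.left, code)
        emit(node.right, code)
        code.append(BINARY_OPS[type(node.op)])
    else:
        raise NotImplementedError(type(node).__name__)


def compile_statement(source):
    """Translate one simple statement into a flat op-code list."""
    statement = ast.parse(source).body[0]
    code = []
    if isinstance(statement, ast.Assign):
        emit(statement.targets[0], code)   # left-hand side is pushed first
        emit(statement.value, code)
        code.append("assign")
    else:
        emit(statement.value, code)        # a bare expression statement
    # A trailing semicolon suppresses printing of the result.
    code.append("pop clean" if source.rstrip().endswith(";") else "print")
    code.append("stop")
    return code


listing = compile_statement("b = a - 2")
for number, entry in enumerate(listing, start=1):
    print("%4d: %s" % (number, entry))
```

Passing `a = 3;' instead yields the same sequence as the earlier listing, ending in pop clean rather than print.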

The stack supports two basic operations, push and pop, as well as some other operations that are unique to RLaB. To fully understand the op-codes it is necessary to look at the source code; an understanding of ``YACC-like'' BNF grammars and of C is also required. If you understand stack machines, many of the operations can be deduced. However, some of the operations are confusing when expressed in op-codes. For example, a function call with expressions embedded in the argument list:

> sqrt ( (1+2)*3 )
   1: push var
   2: sqrt
   3: push constant
   4: 1
   5: push constant
   6: 2
   7: add
   8: push constant
   9: 3
  10: multiply
  11: function call
  12: 1
  13: print
  14: stop
        3

This example is somewhat confusing because sqrt seems to be treated like an ordinary variable; it is, and it should be. The `function call' op-code does not appear until the 11th op-code, well after the variable sqrt has been pushed on the stack. You probably wouldn't figure out that the number 1, the 12th op-code, is the number of function arguments unless you had looked at the grammar for function evaluations.

In conclusion, the `-d' command-line option is available primarily because RLaB is distributed as source code; it does not make much sense to ``hide'' features in programs distributed this way. The disassembler may never be fully documented, certainly not until after version 1.0 of RLaB has been released. Most users should never need to use this option.
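
For readers who do want to follow the sqrt listing, its calling sequence can be sketched in a few lines of Python. This is an interpretation of the listing, not RLaB's code: it assumes that the entry after `function call' is the argument count, that the arguments sit on the stack above the function itself, and that functions live in an ordinary table of variables. BUILTINS and run_call_listing are hypothetical names.

```python
import math

# Illustrative sketch only: the calling convention is inferred from the
# sqrt listing and is an assumption, not taken from RLaB's source.

BUILTINS = {"sqrt": math.sqrt}   # hypothetical table of callable variables


def run_call_listing(code):
    """Execute a flat op-code list and return the printed values."""
    stack, printed, pc = [], [], 0
    while code[pc] != "stop":
        op = code[pc]
        pc += 1
        if op == "push var":
            # A function is pushed exactly like any other variable.
            stack.append(BUILTINS[code[pc]])
            pc += 1
        elif op == "push constant":
            stack.append(code[pc])
            pc += 1
        elif op in ("add", "multiply"):
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == "add" else left * right)
        elif op == "function call":
            nargs = code[pc]             # operand: number of arguments
            pc += 1
            args = [stack.pop() for _ in range(nargs)]
            args.reverse()               # restore left-to-right order
            function = stack.pop()       # pushed before its arguments
            stack.append(function(*args))
        elif op == "print":
            printed.append(stack.pop())
        else:
            raise ValueError("unknown op-code: %r" % (op,))
    return printed


sqrt_listing = ["push var", "sqrt",
                "push constant", 1, "push constant", 2, "add",
                "push constant", 3, "multiply",
                "function call", 1, "print", "stop"]
result = run_call_listing(sqrt_listing)
print(result)  # [3.0]
```

Because the argument expressions are evaluated before the call, by the time `function call' executes the stack holds only the function and its single argument, 9.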